feat(keeper): implement retry logic with exponential backoff for failed submissions #1

Open
ACodehunter wants to merge 1 commit into main from feature/retry-logic-exponential-backoff

Implements a fault-tolerant retry mechanism for transaction submissions that handles transient network failures, RPC timeouts, and fee-related rejections without causing duplicate execution or infinite loops.

Features

  • Generic withRetry() higher-order async wrapper with configurable options
  • Exponential backoff with jitter: baseDelay * 2^attempt + random(0, baseDelay)
  • Error classification into retryable, non-retryable, and duplicate categories
  • DUPLICATE_TRANSACTION responses treated as success (no retry)
  • Comprehensive error code support for Soroban RPC responses
  • Retry count tracking and inclusion in execution logs
  • MAX_RETRIES_EXCEEDED warning event emission
  • Environment variable configuration (MAX_RETRIES, RETRY_BASE_DELAY_MS, MAX_RETRY_DELAY_MS)
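
The backoff formula and the environment variables listed above could fit together roughly as follows. This is a minimal sketch, not the code from this PR: `RetryConfig`, `loadRetryConfig`, `parseNonNegativeInt`, `computeBackoffDelay`, and the default values are hypothetical, and it assumes `MAX_RETRY_DELAY_MS` acts as an upper cap on the computed delay.

```typescript
// Hypothetical config shape; field names and defaults are assumptions.
interface RetryConfig {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// Parse a non-negative integer from an env-style string, or use the fallback.
function parseNonNegativeInt(raw: string | undefined, fallback: number): number {
  const parsed = parseInt(raw ?? "", 10);
  return !isNaN(parsed) && parsed >= 0 ? parsed : fallback;
}

// Build the config from an env-like record (for example process.env in Node.js).
function loadRetryConfig(env: Record<string, string | undefined>): RetryConfig {
  return {
    maxRetries: parseNonNegativeInt(env["MAX_RETRIES"], 3),
    baseDelayMs: parseNonNegativeInt(env["RETRY_BASE_DELAY_MS"], 1000),
    maxDelayMs: parseNonNegativeInt(env["MAX_RETRY_DELAY_MS"], 30000),
  };
}

// baseDelay * 2^attempt + random(0, baseDelay), capped at maxDelayMs.
// The cap is an assumption about how MAX_RETRY_DELAY_MS is applied.
function computeBackoffDelay(
  attempt: number,
  config: RetryConfig,
  random: () => number = Math.random,
): number {
  const exponential = config.baseDelayMs * Math.pow(2, attempt);
  const jitter = random() * config.baseDelayMs;
  return Math.min(exponential + jitter, config.maxDelayMs);
}
```

Passing the random source as a parameter keeps the jitter deterministic in tests; production callers would rely on the `Math.random` default.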

Error Classifications

  • Retryable: TIMEOUT, NETWORK_ERROR, RATE_LIMITED, SERVER_ERROR, etc.
  • Non-retryable: INVALID_ARGS, INSUFFICIENT_GAS, CONTRACT_PANIC, TX_BAD_AUTH, etc.
  • Duplicate: DUPLICATE_TRANSACTION, TX_ALREADY_IN_LEDGER
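
The three categories above could be expressed as a small lookup. This is a sketch under stated assumptions: `ErrorClass` and `classifyError` are hypothetical names, only the codes named in this description are included (the real lists are longer, per the "etc."), and treating unknown codes as non-retryable is an assumption made here, not something this PR states.

```typescript
type ErrorClass = "retryable" | "non-retryable" | "duplicate";

// Only the codes named in the PR description; the actual lists are longer.
const RETRYABLE_CODES: string[] = ["TIMEOUT", "NETWORK_ERROR", "RATE_LIMITED", "SERVER_ERROR"];
const NON_RETRYABLE_CODES: string[] = ["INVALID_ARGS", "INSUFFICIENT_GAS", "CONTRACT_PANIC", "TX_BAD_AUTH"];
const DUPLICATE_CODES: string[] = ["DUPLICATE_TRANSACTION", "TX_ALREADY_IN_LEDGER"];

function classifyError(code: string): ErrorClass {
  // Duplicates are checked first: the transaction already landed, so the
  // caller can treat the submission as a success instead of retrying.
  if (DUPLICATE_CODES.indexOf(code) !== -1) return "duplicate";
  if (RETRYABLE_CODES.indexOf(code) !== -1) return "retryable";
  if (NON_RETRYABLE_CODES.indexOf(code) !== -1) return "non-retryable";
  // Assumption: unknown codes fail fast rather than risk a pointless retry loop.
  return "non-retryable";
}
```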

Testing

  • Unit tests covering success on 2nd attempt
  • Non-retryable error bail scenarios
  • Max retries exceeded handling
  • Duplicate transaction detection
  • Exponential backoff with jitter verification
  • Network error detection
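
The scenarios above could be exercised against a wrapper of roughly the following shape. This is a minimal sketch, not the `withRetry()` from this PR: the `RetryOptions` and `RetryResult` shapes, the injectable `sleep`, and the `Verdict` type are hypothetical, and the `MAX_RETRIES_EXCEEDED` event emission is simplified to a thrown error.

```typescript
type Verdict = "retryable" | "non-retryable" | "duplicate";

// Hypothetical options; the PR's actual option names may differ.
interface RetryOptions {
  maxRetries: number;
  classify: (error: unknown) => Verdict;
  delayMs: (attempt: number) => number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need no real waiting
}

interface RetryResult<T> {
  value?: T;          // absent when the submission was reported as a duplicate
  retries: number;    // retries performed before the final outcome
  duplicate: boolean; // true when a duplicate response was treated as success
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
): Promise<RetryResult<T>> {
  const sleep = options.sleep ?? defaultSleep;
  for (let attempt = 0; ; attempt++) {
    try {
      return { value: await fn(), retries: attempt, duplicate: false };
    } catch (error) {
      const verdict = options.classify(error);
      // A duplicate means the transaction already landed: stop, report success.
      if (verdict === "duplicate") return { retries: attempt, duplicate: true };
      // Non-retryable errors bail immediately with the original error.
      if (verdict === "non-retryable") throw error;
      // Simplification: the PR emits a warning event; this sketch throws.
      if (attempt >= options.maxRetries) {
        throw new Error("MAX_RETRIES_EXCEEDED after " + attempt + " retries");
      }
      await sleep(options.delayMs(attempt));
    }
  }
}
```

Injecting `sleep` lets a test record the requested delays instead of waiting for them, which is one way to verify the backoff schedule without timers.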

Closes SoroLabs#37


Development

Successfully merging this pull request may close these issues.

[Keeper/Backend] Implement Retry Logic with Exponential Backoff for Failed Submissions
